Subband architecture for automatic speaker recognition
Abstract
We present an original approach to automatic speaker identification that is especially applicable to environments which cause partial corruption of the frequency spectrum of the signal. The general principle is to split the whole frequency domain into several subbands, apply statistical recognizers to them independently, and then recombine the recognizer outputs to yield a global score and a global recognition decision. The choice of subband architecture and the recombination strategies are discussed in particular. This technique has been shown to be robust for speech recognition when narrow-band noise degradation occurs. We first objectively verify this robustness for the speaker identification task. We also study which information is actually used to recognize speakers. To this end, speaker identification experiments on independent subbands are conducted for 630 speakers on the TIMIT and NTIMIT databases. The results show that speaker-specific information is not equally distributed among subbands. In particular, the low-frequency subbands (under 600 Hz) and the high-frequency subbands (over 3000 Hz) are more speaker specific than the middle-frequency ones. In addition, experiments on different subband system architectures show that the correlations between frequency channels are of prime importance for speaker recognition. Some of these correlations are lost when the frequency domain is divided into subbands. Consequently, we propose a particularly redundant parallel architecture in which most of the correlations are kept. The performance obtained with this new system, using linear recombination strategies, is equivalent to that of a conventional fullband recognizer on clean and telephone speech. Experiments on speech corrupted by unpredictable noise show that this approach adapts better to noisy environments than a conventional device, especially when some recognizers are pruned.
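The recombination step described in the abstract can be illustrated with a minimal sketch. It assumes each subband recognizer has already produced one log-likelihood per candidate speaker. The function names (`recombine_scores`, `identify`), the uniform default weights, the normalization by total weight, and the toy score values are all hypothetical and are not taken from the paper, which states only that linear recombination strategies and pruning of some recognizers are used.

```python
from typing import Dict, Optional, Sequence


def recombine_scores(
    subband_scores: Sequence[Dict[str, float]],
    weights: Optional[Sequence[float]] = None,
    pruned: Sequence[int] = (),
) -> Dict[str, float]:
    """Linearly combine per-subband log-likelihoods into one score per speaker.

    subband_scores[k][s] is the score that the recognizer of subband k
    assigns to speaker s. Subbands listed in `pruned` are ignored.
    Weighting and normalization are illustrative assumptions.
    """
    n = len(subband_scores)
    if weights is None:
        weights = [1.0] * n  # assumption: equal weight for every subband
    if len(weights) != n:
        raise ValueError("one weight per subband is required")

    skip = set(pruned)
    active = [k for k in range(n) if k not in skip]
    if not active:
        raise ValueError("at least one subband must remain after pruning")

    total_weight = sum(weights[k] for k in active)
    speakers = subband_scores[active[0]].keys()
    return {
        s: sum(weights[k] * subband_scores[k][s] for k in active) / total_weight
        for s in speakers
    }


def identify(
    subband_scores: Sequence[Dict[str, float]],
    weights: Optional[Sequence[float]] = None,
    pruned: Sequence[int] = (),
) -> str:
    """Return the speaker with the highest recombined score."""
    scores = recombine_scores(subband_scores, weights, pruned)
    return max(scores, key=scores.get)


# Toy example (made-up numbers): subband 1 is hit by narrow-band noise
# and gives misleading scores for the true speaker "spk_a".
toy_scores = [
    {"spk_a": -10.0, "spk_b": -12.0},
    {"spk_a": -30.0, "spk_b": -15.0},  # corrupted subband
    {"spk_a": -11.0, "spk_b": -13.0},
]
print(identify(toy_scores))              # decision using all subbands
print(identify(toy_scores, pruned=[1]))  # decision with the noisy subband pruned
```

In this toy case the corrupted subband dominates the unpruned sum and flips the decision, while pruning it restores the decision supported by the two clean subbands. How the recognizers to prune are selected is not specified in the abstract and is left outside the sketch.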
Similar resources
Efficient Training of GMM Based Speaker Recognition System
Automatic speaker recognition (ASR) is based on speech feature vectors, models, and classifiers. To improve speaker recognition performance, we must affect at least one of these modules. In this paper we propose to use subband spectral centroids (SSCs) as complementary features to the traditional MFCC features, and a new GMM training algorithm, with the ultimate goal to search the bette...
Improving speaker identification in noise by subband processing and decision fusion
We investigate speaker identification in narrowband noise using subband processing. The output of each subband is used to train and test individual hidden Markov models (HMMs), each making a preliminary decision on speaker identity. Subsequently, these are combined to produce a final decision. For sufficient numbers of filters, subband processing outperforms traditional wideband techniques by a...
Information Fusion for Subband-HMM Speaker Recognition
Previous work has demonstrated the performance gains that can be obtained in speaker recognition by applying subband processing, together with hidden Markov modelling and multiple classifier recombination. Two recombination rules have been investigated: the sum of log likelihoods, which corresponds to the optimal Bayes’ rule under certain constraints, and multilayer perceptrons (MLP), which are...
Speaker normalized spectral subband parameters for noise robust speech recognition
This paper proposes speaker normalized spectral subband centroids (SSCs) as supplementary features for speech recognition in noisy environments. SSCs are computed as frequency centroids for each subband from the power spectrum of the speech signal. Since the conventional SSCs depend on formant frequencies of a speaker, we introduce a speaker normalization technique into SSC computation to reduce the...
Noise-Robust Speaker Recognition Using Subband Likelihoods and Reliable-Feature Selection
Sungtak Kim et al. We consider the feature recombination technique in a multiband approach to speaker identification and verification. To overcome the ineffectiveness of conventional feature recombination in broadband noisy environments, we propose a new subband feature recombination which uses subband likelihoods and a subband reliable-feature selection technique with an adaptive noise mode...
Journal title:
- Signal Processing
Volume 80, Issue
Pages -
Publication date 2000